feat: content blocking detection for headless browsers by avifenesh · Pull Request #73 · agent-sh/web-ctl

avifenesh · 2026-02-26T01:26:38Z

Summary

- Add a detectContentBlocked() function to detect when sites serve pages but block content from headless browsers (e.g., X.com empty timelines)
- Enhance browser stealth with anti-bot evasion (window.chrome, navigator.plugins, WebGL, permissions.query spoofing)
- Integrate a content-blocked warning into the goto action, with a --no-content-block-detect flag to skip it
- Add X.com provider config with content selectors and blocking indicators

Test Plan

- 541/541 tests passing (33 new tests added)
- Content blocking detection covers: provider-specific selectors, text patterns, empty content, generic patterns, persistent spinners
- X.com-specific test scenarios: empty feed, error state, no false positive with real content
- Edge cases: empty contentSelectors array, page query errors, threshold boundaries
- Integration tests verify goto action wiring

Related Issues

Closes #38

Add contentSelectors and contentBlockedIndicators fields to the X (Twitter) provider entry. These define the DOM selectors and text patterns used to detect when X.com blocks headless browsers from viewing feed content. Updated notes to document blocking behavior.
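To make the two new fields concrete, here is a rough sketch of what such a provider entry could look like. Only the field names contentSelectors and contentBlockedIndicators come from this PR; the nested shape, the `domains` field, every selector and text value, and the `isValidProvider` helper are assumptions for illustration, not the contents of providers.json.

```javascript
// Hypothetical provider entry. Only the names contentSelectors and
// contentBlockedIndicators are taken from the PR text; all values are made up.
const xProvider = {
  name: 'X (Twitter)',
  domains: ['x.com', 'twitter.com'],              // assumed field
  contentSelectors: ['[data-testid="tweet"]'],    // assumed selector
  contentBlockedIndicators: {                     // assumed shape
    selectors: ['[data-testid="error-detail"]'],  // assumed selector
    textPatterns: ['Something went wrong'],       // assumed pattern
  },
  notes: 'Feed content may be withheld from headless browsers.',
};

// Hypothetical sanity check for the assumed shape above.
function isValidProvider(provider) {
  const isStringArray = (v) =>
    Array.isArray(v) && v.every((s) => typeof s === 'string');
  const indicators = provider.contentBlockedIndicators;
  return (
    isStringArray(provider.contentSelectors) &&
    typeof indicators === 'object' &&
    indicators !== null &&
    isStringArray(indicators.selectors) &&
    isStringArray(indicators.textPatterns)
  );
}
```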

Add a new detectContentBlocked() function that detects when sites serve pages but block actual content from headless browsers. Uses five ordered heuristics (OR logic): provider blocked selectors, provider blocked text patterns, empty content areas, generic error text with short body, and persistent loading indicators. Exports CONTENT_BLOCKED_TEXT_PATTERNS.
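The five heuristics can be sketched roughly as below. This is not the merged implementation: the option names other than emptyContentThreshold, the pattern and selector lists, the reason strings, and the exact meaning of "short body" are assumptions. The `page` argument only needs a Playwright-style `$()` that resolves to an element handle (with `textContent()` and `isVisible()`) or null.

```javascript
// Illustrative values; the real exported lists are not shown in this PR text.
const CONTENT_BLOCKED_TEXT_PATTERNS = [/something went wrong/i, /access denied/i];
const LOADING_INDICATOR_SELECTORS = ['[role="progressbar"]', '.spinner'];

async function detectContentBlocked(page, options = {}) {
  const {
    contentSelectors = [],
    blockedSelectors = [],      // assumed option name
    blockedTextPatterns = [],   // assumed option name
    emptyContentThreshold = 200,
  } = options;

  // Treat query failures as "not found" so detection never throws.
  const find = async (selector) => {
    try { return await page.$(selector); } catch { return null; }
  };
  const textOf = async (el) => {
    try { return ((await el.textContent()) || '').trim(); } catch { return ''; }
  };

  // 1. Provider-specific "blocked" selectors.
  for (const selector of blockedSelectors) {
    if (await find(selector)) return { blocked: true, reason: 'provider_selector' };
  }

  // Fetch the body text once and reuse it for heuristics 2 and 4.
  const body = await find('body');
  const bodyText = body ? await textOf(body) : '';

  // 2. Provider-specific text patterns (plain substrings here).
  const lowerBody = bodyText.toLowerCase();
  for (const pattern of blockedTextPatterns) {
    if (lowerBody.includes(pattern.toLowerCase())) {
      return { blocked: true, reason: 'provider_text' };
    }
  }

  // 3. Content areas that are missing or nearly empty.
  if (contentSelectors.length > 0) {
    let total = 0;
    for (const selector of contentSelectors) {
      const el = await find(selector);
      if (el) total += (await textOf(el)).length;
    }
    if (total < emptyContentThreshold) return { blocked: true, reason: 'empty_content' };
  }

  // 4. Generic error text on a page with a short body.
  if (
    bodyText.length < emptyContentThreshold &&
    CONTENT_BLOCKED_TEXT_PATTERNS.some((re) => re.test(bodyText))
  ) {
    return { blocked: true, reason: 'generic_error_text' };
  }

  // 5. A loading indicator that is still visible.
  for (const selector of LOADING_INDICATOR_SELECTORS) {
    const el = await find(selector);
    if (el && (await el.isVisible())) return { blocked: true, reason: 'persistent_loading' };
  }

  return { blocked: false };
}
```

Because the checks short-circuit in order, the cheaper provider-specific signals decide the result before the generic ones are consulted.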

Expand the addInitScript block with additional stealth measures:
- Spoof window.chrome object (present in real Chrome, missing in headless)
- Spoof navigator.plugins with non-empty PluginArray-like object
- Set navigator.languages to ['en-US', 'en']
- Override WebGL vendor/renderer to Intel Inc. / Intel Iris OpenGL Engine
- Override permissions.query for 'notifications' to return denied state
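A minimal sketch of the overrides listed above, written as a function that takes a window-like object so it can be exercised outside a browser. The function name `applyStealth`, the plugin name, and the exact shapes of the spoofed objects are assumptions; the merged init script may differ. The numeric constants are the `UNMASKED_VENDOR_WEBGL` (37445) and `UNMASKED_RENDERER_WEBGL` (37446) parameters from the WEBGL_debug_renderer_info extension.

```javascript
// Hypothetical helper; `win` is a window-like object (the real `window` in a page).
function applyStealth(win) {
  const nav = win.navigator;

  // window.chrome exists in regular Chrome but is missing in headless mode.
  if (!win.chrome) win.chrome = { runtime: {} };

  // Non-empty PluginArray-like object (the plugin name is illustrative).
  const plugins = {
    0: { name: 'Chrome PDF Viewer' },
    length: 1,
    item(index) { return this[index] || null; },
    namedItem(name) { return this[0].name === name ? this[0] : null; },
  };
  Object.defineProperty(nav, 'plugins', { get: () => plugins, configurable: true });
  Object.defineProperty(nav, 'languages', { get: () => ['en-US', 'en'], configurable: true });

  // Report an Intel GPU for the unmasked vendor/renderer WebGL parameters.
  const proto = win.WebGLRenderingContext && win.WebGLRenderingContext.prototype;
  if (proto && typeof proto.getParameter === 'function') {
    const originalGetParameter = proto.getParameter;
    proto.getParameter = function (parameter) {
      if (parameter === 37445) return 'Intel Inc.';
      if (parameter === 37446) return 'Intel Iris OpenGL Engine';
      return originalGetParameter.call(this, parameter);
    };
  }

  // Answer 'notifications' permission queries with a denied state.
  if (nav.permissions && typeof nav.permissions.query === 'function') {
    const originalQuery = nav.permissions.query.bind(nav.permissions);
    nav.permissions.query = (descriptor) =>
      descriptor && descriptor.name === 'notifications'
        ? Promise.resolve({ state: 'denied' })
        : originalQuery(descriptor);
  }
}

// One possible way to register it with Playwright, which accepts a script string:
//   await context.addInitScript(`(${applyStealth.toString()})(window);`);
```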

Import detectContentBlocked and add matchProviderByDomain helper with lazy-loaded Map for O(1) provider lookup. After goto navigation and waitForLoaded, detect content blocking using provider-specific config from providers.json. When detected, add contentBlocked, warning, reason, and suggestion fields to the result. Add --no-content-block-detect flag to skip detection.
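The lazy-loaded lookup and the result annotation might look something like the sketch below. The inline `PROVIDERS` array stands in for providers.json, and the `domains` field, the `www.` stripping, the `annotateResult` helper, and the warning and suggestion strings are all assumptions; only the names matchProviderByDomain, contentBlocked, warning, reason, and suggestion appear in the PR.

```javascript
// Stand-in for providers.json (assumed shape).
const PROVIDERS = [
  { name: 'X (Twitter)', domains: ['x.com', 'twitter.com'] },
];

let providerMap = null; // built on first lookup, then reused

function getProviderMap() {
  if (providerMap === null) {
    providerMap = new Map();
    for (const provider of PROVIDERS) {
      for (const domain of provider.domains) {
        providerMap.set(domain.toLowerCase(), provider);
      }
    }
  }
  return providerMap;
}

// O(1) lookup by hostname; returns null for unknown hosts or invalid URLs.
function matchProviderByDomain(url) {
  let host;
  try {
    host = new URL(url).hostname.toLowerCase();
  } catch {
    return null;
  }
  return getProviderMap().get(host.replace(/^www\./, '')) || null;
}

// Hypothetical helper: merge a detection outcome into the goto result.
function annotateResult(result, detection) {
  if (!detection.blocked) return result;
  return {
    ...result,
    contentBlocked: true,
    warning: 'content_blocked',                       // assumed value
    reason: detection.reason,
    suggestion: 'Retry with a headed browser session', // assumed value
  };
}
```

Building the Map only on first use keeps startup cheap for commands that never navigate, at the cost of one module-level mutable variable.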

Add 19 tests covering all detection heuristics:
- Provider-specific blocked selectors and text patterns
- Empty content area detection with threshold
- Generic error text with short body
- Persistent loading indicators (visible vs invisible)
- Error handling for page.$() and textContent() failures
- Default emptyContentThreshold of 200
- X.com-specific: empty feed, error state, no false positives
- CONTENT_BLOCKED_TEXT_PATTERNS export validation

…xports Add two tests to the existing auth-wall-detect test suite confirming that the new detectContentBlocked function and CONTENT_BLOCKED_TEXT_PATTERNS constant are properly exported from the module.

…e case tests
- Cache bodyText fetch in detectContentBlocked to avoid redundant DOM query
- Export LOADING_INDICATOR_SELECTORS for testability
- Add empty contentSelectors array edge case test
- Add LOADING_INDICATOR_SELECTORS validation tests

avifenesh added 9 commits February 26, 2026 03:06

test: verify detectContentBlocked and CONTENT_BLOCKED_TEXT_PATTERNS e…

7c71867

test: add integration tests for content-block detection in goto

8eb272d

docs: document content blocking detection in goto action

117ffc1

avifenesh merged commit f02e6a9 into main Feb 26, 2026
2 of 3 checks passed

avifenesh deleted the feature/auth-wall-detection-38 branch February 26, 2026 01:32

This was referenced Feb 26, 2026

X.com feed blocks headless browsers - use as test case for auth wall detection #38

Closed

feat: headless stealth hardening + auto headed fallback #74

Merged
